Johnson-Lindenstrauss Dimensionality Reduction on the Simplex
Authors
Abstract
We propose an algorithm for dimensionality reduction on the simplex, mapping a set of high-dimensional distributions to a space of lower-dimensional distributions, whilst approximately preserving pairwise Hellinger distance between distributions. By introducing a restriction on the input data to distributions that are, in a suitable sense, smooth, we can map n points on the d-simplex to the simplex of O(ε⁻² log n) dimensions with ε-distortion with high probability. The techniques used rely on a classical result by Johnson and Lindenstrauss on dimensionality reduction for Euclidean point sets and require the same number of random bits as the non-sparse methods proposed by Achlioptas for database-friendly dimensionality reduction.
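To make the distance-preservation step concrete, below is a minimal Python sketch. It relies on the standard identity H(p, q) = (1/√2)·||√p − √q||₂ relating Hellinger distance to the Euclidean distance of square-root vectors; the Dirichlet inputs (a stand-in for the smoothness restriction), the constant 8 in the target dimension, and the plain Gaussian projection are illustrative choices, and the sketch omits the paper's final step of mapping the projected points back onto a lower-dimensional simplex.

```python
import numpy as np

rng = np.random.default_rng(0)

def hellinger(p, q):
    # H(p, q) = (1/sqrt(2)) * || sqrt(p) - sqrt(q) ||_2
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2)

n, d = 100, 10000                          # number of distributions, number of outcomes
eps = 0.25
k = int(np.ceil(8 * np.log(n) / eps**2))   # target dimension, O(eps^-2 log n); constant 8 is a common choice

# Random, fairly smooth distributions over d outcomes (illustrative stand-in
# for the paper's smoothness restriction).
P = rng.dirichlet(np.full(d, 5.0), size=n)

# Square-root embedding: Hellinger distance on the simplex equals (1/sqrt(2)) times
# the Euclidean distance between the square-root vectors, which lie on the unit sphere.
S = np.sqrt(P)

# Dense Gaussian JL projection of the embedded points.
G = rng.normal(size=(k, d)) / np.sqrt(k)
Y = S @ G.T

# Euclidean distances between projected points, rescaled by 1/sqrt(2),
# approximate the original Hellinger distances.
print(hellinger(P[0], P[1]), np.linalg.norm(Y[0] - Y[1]) / np.sqrt(2))
```

The two printed numbers should agree up to roughly a (1 ± ε) factor, since the JL projection preserves the pairwise Euclidean distances of the square-root vectors.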
Related Resources
Dimensionality Reduction on the Simplex
For many problems in data analysis, the natural way to model objects is as a probability distribution over a finite and discrete domain. Probability distributions over such domains can be represented as points on a (high-dimensional) simplex, and thus many inference questions involving distributions can be viewed geometrically as manipulating points on a simplex. The dimensionality of these poi...
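As a toy illustration of this representation (with made-up counts), normalizing a nonnegative count vector yields a point on the probability simplex:

```python
import numpy as np

# Made-up example: an object described by raw counts over a 5-element domain
# becomes a point on the 4-dimensional simplex after normalization.
counts = np.array([3.0, 0.0, 7.0, 5.0, 1.0])
p = counts / counts.sum()
assert np.isclose(p.sum(), 1.0) and np.all(p >= 0)   # lies on the probability simplex
print(p)
```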
The Fast Johnson-Lindenstrauss Transform
While we omit the proof, we remark that it is constructive. Specifically, A is a linear map consisting of random projections onto subspaces of ℝ^d. These projections can be computed with n matrix-vector multiplications, which take O(nkd) time in total. This is fast enough to make the Johnson-Lindenstrauss transform (JLT) a practical and widespread algorithm for dimensionality reduction, which in turn motivates th...
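For reference, here is a rough Python sketch of the structured transform this motivates, following the widely known Ailon-Chazelle FJLT form P·H·D (a sparse Gaussian projection P, Walsh-Hadamard mixing H, and random sign flips D); the sparsity parameter q and the final scaling are illustrative choices and may differ from the exact construction in these notes.

```python
import numpy as np

rng = np.random.default_rng(1)

def fwht(x):
    """Normalized fast Walsh-Hadamard transform; len(x) must be a power of two."""
    x = x.astype(float).copy()
    d, h = len(x), 1
    while h < d:
        for i in range(0, d, 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x / np.sqrt(d)

d, k, n = 1024, 64, 500                     # power-of-two d for the Hadamard step
D = rng.choice([-1.0, 1.0], size=d)         # random sign flips (diagonal D)
q = min(1.0, np.log(n) ** 2 / d)            # sparsity of P (rough choice)
P = np.where(rng.random((k, d)) < q,        # sparse Gaussian projection P
             rng.normal(0.0, 1.0 / np.sqrt(q), size=(k, d)), 0.0)

def fjlt(x):
    # Sign-flip, mix with the Hadamard transform, then project sparsely.
    return (P @ fwht(D * x)) / np.sqrt(k)

x = rng.normal(size=d)
print(np.linalg.norm(x), np.linalg.norm(fjlt(x)))   # norms should be comparable
```

The mixing step H costs O(d log d) per point via the fast Walsh-Hadamard transform, and P has only about qkd nonzero entries, which is the source of the speedup over the dense per-point cost of O(kd).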
Geometric Optimization, April 12, 2007, Lecture 25: Johnson-Lindenstrauss Lemma
The topic of this lecture is dimensionality reduction. Many problems have been solved efficiently in low dimensions, but very often the low-dimensional solutions are impractical in high-dimensional spaces because either the space or the running time is exponential in the dimension. In order to address the curse of dimensionality, one technique is to map a set of points in a high-dimensional space...
The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction
For any n > 1 and 0 < ε < 1/2, we show the existence of an n-point subset X of ℝ^n such that any linear map from (X, ℓ_2) to ℓ_2^m with distortion at most 1 + ε must have m = Ω(min{n, ε⁻² log n}). Our lower bound matches the upper bounds provided by the identity matrix and the Johnson-Lindenstrauss lemma [JL84], improving the previous lower bound of Alon [Alo03] by a log(1/ε) factor.
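As a quick numeric illustration (taking the hidden constant in the bound to be 1), for a million points and ε = 0.1 the optimal embedding dimension is on the order of ε⁻² log n, far below n:

```python
# Illustrative only: constants inside the Omega/Theta are taken to be 1.
import numpy as np

n, eps = 10**6, 0.1
m = min(n, int(np.ceil(np.log(n) / eps**2)))
print(m)   # about 1382: the scale of dimension that is both sufficient and necessary, up to constants
```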
Energy-aware adaptive Johnson-Lindenstrauss embedding via RIP-based designs
We consider a dimensionality-reducing matrix design based on training data, with constraints on its Frobenius norm and number of rows. Our design criterion is aimed at preserving the distances between the data points in the dimensionality-reduced space as much as possible relative to their distances in the original data space. This approach can be considered as a deterministic Johnson-Lindenstrauss e...
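One possible way to make that criterion concrete (a hypothetical formalization, not the paper's actual objective): score a candidate k × d matrix A, constrained to a fixed Frobenius-norm budget, by the squared relative distortion of pairwise distances on training data. The name distortion_score and the specific penalty below are my own illustration.

```python
# Hypothetical formalization of the stated design criterion, for illustration only.
import numpy as np
from itertools import combinations

def distortion_score(A, X):
    # Sum of squared relative distance distortions over all training pairs.
    score = 0.0
    for i, j in combinations(range(len(X)), 2):
        orig = np.linalg.norm(X[i] - X[j])
        proj = np.linalg.norm(A @ (X[i] - X[j]))
        score += (proj / orig - 1.0) ** 2
    return score

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 100))          # training data: 50 points in 100 dimensions
k, budget = 10, 10.0                    # number of rows and Frobenius-norm budget
A = rng.normal(size=(k, 100))
A *= budget / np.linalg.norm(A)         # enforce ||A||_F = budget
print(distortion_score(A, X))           # lower is better under this criterion
```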
Publication date: 2010